57 research outputs found

    A generic method for assignment of reliability scores applied to solvent accessibility predictions

    Background: Estimating the reliability of specific real-value predictions is nontrivial, and the efficacy of such estimates is often questionable. It is important to know whether a given prediction can be trusted, and therefore the best methods associate each prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score. Results: An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures, where reliabilities are obtained by post-processing the output. Conclusion: The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones in the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our method and the compared method, respectively. This tendency is true for any selected subset.
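
The comparison described in the conclusion (Pearson's correlation on the upper 20% of predictions ranked by reliability) can be illustrated with a minimal sketch. This is not the authors' code: the function names `pearson` and `pearson_top_fraction` are hypothetical, and the reliability scores are simply taken as an input list, with no assumption about how the Z-scores are produced.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pearson_top_fraction(predicted, observed, reliability, fraction=0.2):
    """Correlation restricted to the most reliable `fraction` of predictions.

    Hypothetical helper: ranks predictions by reliability (highest first)
    and keeps the top share, but never fewer than two points.
    """
    ranked = sorted(
        zip(reliability, predicted, observed),
        key=lambda item: item[0],
        reverse=True,
    )
    keep = max(2, math.ceil(fraction * len(ranked)))
    top = ranked[:keep]
    return pearson([p for _, p, _ in top], [o for _, _, o in top])
```

If the reliability score is informative, the value returned by `pearson_top_fraction` should exceed `pearson` computed on the full set, which is the pattern the abstract reports (0.79 versus an overall 0.72).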

    CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles

    CPHmodels-3.0 is a web server that predicts protein 3D structure using single-template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. The server first attempts to model a query sequence using the fast CPHmodels-2.0 profile-profile scoring function, which is suitable for close homology modeling. The new, computationally costly remote homology-modeling algorithm is engaged only if no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models for 94% of the targets (117 out of 128); 74% of these were predicted as high-reliability models (87 out of 117). These achieved an average RMSD of 4.6 Å when superimposed on the 3D structure. The remaining 26% low-reliability models (30 out of 117) could be superimposed on the true 3D structure with an average RMSD of 9.3 Å. These performance values place the CPHmodels-3.0 method in the group of high-performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server i

    Norgal: Extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data

    A .docx document with full results and detailed benchmarking of Norgal against MITObim and NOVOPlasty. Section S1: Full Norgal output for a subset of the test data. Section S2: Extraction of chloroplast from Vitis vinifera (grapevine). Section S3: Benchmarking against other methods. Section S4: Mitochondrial test data sets. (DOCX 1485 kb)

    Detection of mobile genetic elements associated with antibiotic resistance in Salmonella enterica using a newly developed web tool: MobileElementFinder

    Objectives - Antimicrobial resistance (AMR) in clinically relevant bacteria is a growing threat to public health globally. In these bacteria, antimicrobial resistance genes are often associated with mobile genetic elements (MGEs), which promote their mobility, enabling them to rapidly spread throughout a bacterial community. Methods - The tool MobileElementFinder was developed to enable rapid detection of MGEs and their genetic context in assembled sequence data. MGEs are detected based on sequence similarity to a database of 4452 known elements augmented with annotation of resistance genes, virulence factors and detection of plasmids. Results - MobileElementFinder was applied to analyse the mobilome of 1725 sequenced Salmonella enterica isolates of animal origin from Denmark, Germany and the USA. We found that the MGEs were seemingly conserved according to multilocus ST and not restricted to either the host or the country of origin. Moreover, we identified putative translocatable units for specific aminoglycoside, sulphonamide and tetracycline genes. Several putative composite transposons were predicted that could mobilize, among others, AMR, metal resistance and phosphodiesterase genes associated with macrophage survivability. This is, to our knowledge, the first time the phosphodiesterase-like pdeL has been found to be potentially mobilized into S. enterica. Conclusions - MobileElementFinder is a powerful tool to study the epidemiology of MGEs in a large number of genome sequences and to determine the potential for genomic plasticity of bacteria. This web service provides a convenient method of detecting MGEs in assembled sequence data. MobileElementFinder can be accessed at https://cge.cbs.dtu.dk/services/MobileElementFinder/

    MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

    An increasing number of species and gene identification studies rely on next-generation sequence analysis of either single-isolate or metagenomics samples. Several methods are available to perform taxonomic annotations, and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next-generation sequence data and perform reference-based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species- and strain-level resolution. An in vitro bacterial mock community sample comprising 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain-level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets

    Meta-genomic analysis of toilet waste from long distance flights; a step towards global surveillance of infectious diseases and antimicrobial resistance

    Human populations worldwide are increasingly confronted with infectious diseases and antimicrobial resistance spreading faster and appearing more frequently. Knowledge regarding their occurrence and worldwide transmission is important to control outbreaks and prevent epidemics. Here, we performed shotgun sequencing of toilet waste from 18 international airplanes arriving in Copenhagen, Denmark, from nine cities in three world regions. An average of 18.6 Gb (14.8 to 25.7 Gb) of raw Illumina paired-end sequence data was generated, cleaned, trimmed and mapped against reference sequence databases for bacteria and antimicrobial resistance genes. An average of 106,839 (0.06%) reads were assigned to resistance genes, with genes encoding resistance to tetracycline, macrolide and beta-lactam antibiotics being the most abundant in all samples. We found significantly higher abundance and diversity of genes encoding antimicrobial resistance, including critically important resistance (e.g. bla(CTX-M)), carried on airplanes from South Asia compared to North America. Salmonella enterica and norovirus were also detected in higher amounts in samples from South Asia, whereas Clostridium difficile was most abundant in samples from North America. Our study provides a first step towards a potential novel strategy for global surveillance enabling simultaneous detection of multiple human health-threatening genetic elements, infectious agents and resistance genes

    NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against the commonly used dataset BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprised of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. CONCLUSION: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences
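
The two-class measures quoted above (MCC, Qtotal, sensitivity, PPV) all follow from the four confusion-matrix counts. The sketch below uses the standard definitions, reading Qtotal as the fraction of all residues classified correctly. The function name `two_class_metrics` is hypothetical, and any counts passed to it are illustrative inputs, not the BT426 results.

```python
import math


def two_class_metrics(tp, fp, tn, fn):
    """Standard two-class measures from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives (here: residues predicted as beta-turn or not).
    Ratios with a zero denominator are reported as 0.0.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
        "q_total": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "ppv": ratio(tp, tp + fp),
    }
```

AUC is omitted because it depends on the ranking of the raw network outputs rather than on a single confusion matrix.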

    Intra-articular vs. systemic administration of etanercept in antigen-induced arthritis in the temporomandibular joint. Part I: histological effects

    Background: Temporomandibular joint (TMJ) arthritis in children causes alterations in craniomandibular growth. This abnormal growth may be prevented by an early anti-inflammatory intervention. We have previously shown that intra-articular (IA) corticosteroid reduces TMJ inflammation, but causes concurrent mandibular growth inhibition in young rabbits. Blockage of TNF-α has already proven its efficacy in children with juvenile idiopathic arthritis not responding to standard therapy. In this paper we evaluate the effect of IA etanercept compared to subcutaneous etanercept in antigen-induced TMJ-arthritis in rabbits on histological changes using histomorphometry and stereology. This article presents the data and discussion on the anti-inflammatory effects of systemic and IA etanercept. In Part II the data on the effects of systemic and IA etanercept on facial growth are presented. Methods: Forty-two rabbits (10 weeks old) pre-sensitized with ovalbumin and locally induced inflammation in the temporomandibular joints were divided into three groups: a placebo group receiving IA saline injections in both joints one week after arthritis induction (n = 14), an IA etanercept group receiving 0.1 mg/kg etanercept per joint one week after arthritis induction (n = 14) and a systemic etanercept group receiving 0.8 mg/kg etanercept weekly throughout the 12-week study (n = 14). Arthritis was maintained by giving four inductions three weeks apart. Additional IA saline or etanercept injections were also given one week after the re-inductions. Histomorphometric and unbiased stereological methods (optical fractionator) were used to assess and estimate the inflammation in the joints. Results: The histomorphometry showed synovial proliferation in all groups. The plasma cell count obtained by the optical fractionator was significantly reduced when treating with systemic etanercept but not with IA etanercept. Semi-quantitative assessments of synovial proliferation and subsynovial inflammation also showed reduced inflammation in the systemic etanercept group. However, the thickness of the synovial lining and volume of the subsynovial connective tissue showed no differences between the groups. Conclusion: An anti-inflammatory effect of systemic etanercept on the synovial tissues in the temporomandibular joint was shown. However, IA etanercept at the given dose had no significant effect on the severity of chronic inflammation on the parameters here tested in ovalbumin antigen-induced arthritis.

    Proficiency testing of virus diagnostics based on bioinformatics analysis of simulated in silico high-throughput sequencing data sets

    Quality management and independent assessment of high-throughput sequencing-based virus diagnostics have not yet been established as a mandatory approach for ensuring comparable results. The sensitivity and specificity of viral high-throughput sequence data analysis are highly affected by bioinformatics processing using publicly available and custom tools and databases and thus differ widely between individuals and institutions. Here we present the results of the COMPARE [Collaborative Management Platform for Detection and Analyses of (Re-) emerging and Foodborne Outbreaks in Europe] in silico virus proficiency test. An artificial, simulated in silico data set of Illumina HiSeq sequences was provided to 13 different European institutes for bioinformatics analysis to identify viral pathogens in high-throughput sequence data. Comparison of the participants’ analyses shows that the use of different tools, programs, and databases for bioinformatics analyses can impact the correct identification of viral sequences from a simple data set. The identification of slightly mutated and highly divergent virus genomes has been shown to be most challenging. Furthermore, the interpretation of the results, together with a fictitious case report, by the participants showed that in addition to the bioinformatics analysis, the virological evaluation of the results can be important in clinical settings. External quality assessment and proficiency testing should become an important part of validating high-throughput sequencing-based virus diagnostics and could improve the harmonization, comparability, and reproducibility of results. There is a need for the establishment of international proficiency testing, like that established for conventional laboratory tests such as PCR, for bioinformatics pipelines and the interpretation of such results

    Prediction of Disease Causing Non-Synonymous SNPs by the Artificial Neural Network Predictor NetDiseaseSNP.

    We have developed a sequence conservation-based artificial neural network predictor called NetDiseaseSNP, which classifies nsSNPs as disease-causing or neutral. Our method uses the excellent alignment generation algorithm of SIFT to identify related sequences and a combination of 31 features assessing sequence conservation and the predicted surface accessibility to produce a single score, which can be used to rank nsSNPs based on their potential to cause disease. NetDiseaseSNP successfully classifies disease-causing and neutral mutations. In addition, we show that NetDiseaseSNP discriminates cancer driver and passenger mutations satisfactorily. Our method outperforms other state-of-the-art methods on several disease/neutral datasets as well as on cancer driver/passenger mutation datasets and can thus be used to pinpoint and prioritize plausible disease candidates among nsSNPs for further investigation. NetDiseaseSNP is publicly available as an online tool as well as a web service: http://www.cbs.dtu.dk/services/NetDiseaseSN